Example Coursework

<-- These indicate important points, they would not be written up

Hypothesis 1

“Pupils who live closer to school are lighter than those living a large distance away.” <-- The Hypothesis
This is because people living a distance away will get a lift in whereas people living closer will walk. <-- The reason for your hypothesis
correlation_example.JPG
I expect there to be a positive correlation and the graph to look like this. <-- Expectations

I calculated a stratified sample for the school. <-- Sample technique
This involves calculations to get a "representation" of the whole school by keeping each subgroup in the same ratio.

whole school = 1200 pupils
year 7 boys have 150 pupils so 150/1200 = 0.125 and 0.125 x 30 = 3.75 which is about 4 <--- calculations
year 7 girls also have 150 pupils so 150/1200 = 0.125 and 0.125 x 30 = 3.75 which is about 4

year 8 boys 145/1200 x 30 = 3.625 ≈ 4
year 8 girls 125/1200 x 30 = 3.125 ≈ 3

year 9 boys 120/1200 x 30 = 3
year 9 girls 140/1200 x 30 = 3.5 ≈ 4

year 10 boys 100/1200 x 30 = 2.5 ≈ 3
year 10 girls 100/1200 x 30 = 2.5 ≈ 3

year 11 boys 84/1200 x 30 = 2.1 ≈ 2
year 11 girls 86/1200 x 30 = 2.15 ≈ 2

This gives a total of 32, so I will remove the extra boy in year 8 and the extra girl in year 9. <-- reason for omissions
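
The stratified sample sizes above can be checked with a short script. This is only a sketch: the names STRATA and stratified_counts are made up for illustration, and the year 7 girls figure of 150 is an assumption inferred from the whole-school total of 1200. Halves are rounded up to match the working above (Python's built-in round would send 2.5 to 2).

```python
import math

# Pupils in each subgroup. The year 7 girls figure is an assumption,
# inferred from the stated whole-school total of 1200.
STRATA = {
    "year 7 boys": 150, "year 7 girls": 150,
    "year 8 boys": 145, "year 8 girls": 125,
    "year 9 boys": 120, "year 9 girls": 140,
    "year 10 boys": 100, "year 10 girls": 100,
    "year 11 boys": 84, "year 11 girls": 86,
}
SAMPLE_SIZE = 30


def stratified_counts(strata, sample_size):
    """Return how many pupils to sample from each subgroup."""
    total = sum(strata.values())
    counts = {}
    for group, size in strata.items():
        exact = size * sample_size / total       # e.g. 150 * 30 / 1200 = 3.75
        counts[group] = math.floor(exact + 0.5)  # round to nearest, halves up
    return counts


counts = stratified_counts(STRATA, SAMPLE_SIZE)
for group, n in counts.items():
    print(group, n)
print("total", sum(counts.values()))
```

Because each subgroup is rounded separately, the rounded counts need not add up to 30, which is why two pupils have to be removed by hand.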

I will use the random number key on my calculator to choose my sample. If a piece of data seems wrong it will be ignored and the random key used again. I will use the first three digits of the random number for groups with 100 or more pupils and the first two digits for groups with fewer than 100. I will ignore any random number larger than the number of items in the set of data.
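
The selection rule above can be imitated on a computer. This is a rough sketch, not the calculator method itself: pick_sample is a made-up helper, pupils are assumed to be numbered from 1 upwards within each group, and rejecting a number that has already been picked is an extra assumption not stated above.

```python
import random


def pick_sample(group_size, how_many, rng):
    """Pick distinct pupil numbers from 1..group_size, re-drawing rejects."""
    digits = 3 if group_size >= 100 else 2
    chosen = []
    while len(chosen) < how_many:
        # Like reading the first 2 or 3 digits off the calculator display.
        number = rng.randrange(10 ** digits)
        if number < 1 or number > group_size:
            continue  # outside the data set, so press the key again
        if number in chosen:
            continue  # assumption: do not pick the same pupil twice
        chosen.append(number)
    return chosen


rng = random.Random(1)  # fixed seed only so the sketch is repeatable
year_7_boys = pick_sample(150, 4, rng)
year_11_girls = pick_sample(86, 2, rng)
print(year_7_boys, year_11_girls)
```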

The resulting graph looks like this:
scatter_both.JPG
The anomalous data stretches the scale so that the rest of the points are too small to read. I will ignore that data and redraw the graph.
scatter_both_2.JPG
There is very weak correlation, not enough to justify a line of best fit. I will split the data to see whether male and female pupils show a better correlation separately.




Hypothesis 2

“There is a correlation between the distance travelled and weight for male and female pupils.”
I expect the two graphs to show negative correlation. I am splitting the males and females because their metabolisms are different, so separate graphs should show up any connection.

scatter_girl.JPG scatter_boy.JPG
It would seem there is little correlation between distance travelled and weight, though the boys do show stronger correlation than the girls. There are many possible reasons for this. Some pupils who live close still take the car, we have not taken into account height and metabolism, and the data chosen might just work better for boys than girls. <-- Possible reason for results


external image 5aa69d11142e11a759432e16f7d21970.png
Using the above formula we can calculate a value which indicates the strength of correlation.
1 and -1 are perfect positive and perfect negative correlation, whereas 0 is no correlation.

Using Spearman's rank correlation coefficient we get the following rank tables
(the first one for girls and the second one for boys)
spearmans_girls.JPG spearmans_boys.JPG

According to this the coefficient for girls is 0.09286.
This is positive but so close to zero that the correlation is very weak.
The boys fare better with a coefficient of 0.36964, which is stronger but still weak.
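
For readers who want to check a rank table, here is a minimal sketch of the usual form of Spearman's rank coefficient, 1 - 6Σd²/(n(n² - 1)), where d is the difference between the two ranks of each pupil and n is the number of pupils. It is assumed that this is the formula pictured above. The function names and the data are hypothetical, not the coursework sample, and the simple ranking assumes no two pupils share a value.

```python
def ranks(values):
    """Rank values from 1 (smallest) upwards; assumes no tied values."""
    order = sorted(values)
    return [order.index(v) + 1 for v in values]


def spearman(xs, ys):
    """Spearman's rank coefficient using 1 - 6*sum(d^2) / (n(n^2 - 1))."""
    n = len(xs)
    d_squared = sum((rx - ry) ** 2 for rx, ry in zip(ranks(xs), ranks(ys)))
    return 1 - 6 * d_squared / (n * (n ** 2 - 1))


# Hypothetical distances (km) and weights (kg), not the coursework data.
distances = [0.5, 1.2, 2.0, 3.5, 5.0]
weights = [48, 45, 52, 50, 60]
print(round(spearman(distances, weights), 3))
```

With these made-up numbers the ranks differ by at most one place, so the coefficient comes out close to 1; values near 0, like those found for the real sample, mean the two rank orders have little in common.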

Hypothesis 3

“The average weight of boys and girls is the same but there will be a bigger range of weight for girls than boys.”
I feel this to be the case as some girls are more conscious of their weight and so there will be more fluctuations and variety in weights.

I expect the box plot to look like this:
box_plot_example.JPG
To do this I will need to draw two cumulative frequency curves and then generate box plots from these. I will also need to put together some grouped data tables.
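
The values a box plot needs (lowest value, lower quartile, median, upper quartile, highest value) can also be worked out directly from a list of raw weights. The sketch below does that instead of reading them from a cumulative frequency curve, so its answers will differ slightly from the grouped-data method. The helper name and the weights are hypothetical.

```python
import statistics


def box_plot_summary(data):
    """Return the five values a box plot is drawn from, plus the IQR."""
    q1, median, q3 = statistics.quantiles(data, n=4, method="exclusive")
    return {
        "min": min(data),
        "q1": q1,
        "median": median,
        "q3": q3,
        "max": max(data),
        "iqr": q3 - q1,  # interquartile range: the length of the box
    }


# Hypothetical weights in kg, not the coursework sample.
girls = [38, 41, 44, 45, 47, 50, 52]
summary = box_plot_summary(girls)
print(summary)
```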

girls_cumulative_frequency.JPG boys_cumulative_frequency.JPG
cumulative_both.JPG

The box plot looks like this:
box_plot_both.JPG
It would seem that the average weight for boys is higher than for girls and that the boys have a bigger spread of weights. This means my hypothesis is wrong. The boys' median and quartiles are all higher. This could be due to boys maturing later and growing during their time at secondary school, whereas girls have already started to mature by the time they get there.


Hypothesis 4

“The distribution of boys will be weighted towards the heavier end whereas girls will be more 'normally' distributed.”
I feel that, given the previous results, girls will be more normally distributed as their range is smaller. Boys, on the other hand, reach puberty during secondary school, so body weight increases quickly and there should be more boys who have put on weight.

I expect the histograms to have this type of shape.
girls_histogram.JPG boys_histogram.JPG
I will construct histograms using sensible class intervals.
Histograms use frequency density to set the height of the bars.
This is calculated by dividing the frequency by the width of the class interval.
This means the intervals do not all have to be the same width, because the area of each bar represents the frequency.
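
The bar-height calculation described above (frequency divided by the width of the interval, usually called frequency density) can be sketched as follows. The class intervals and frequencies are hypothetical, chosen only to show unequal widths.

```python
def frequency_densities(classes):
    """Bar height for each class: frequency / class width."""
    return [freq / (upper - lower) for lower, upper, freq in classes]


# Hypothetical grouped weights: (lower bound kg, upper bound kg, frequency).
classes = [(30, 40, 5), (40, 45, 8), (45, 50, 10), (50, 70, 6)]
heights = frequency_densities(classes)
for (lower, upper, freq), height in zip(classes, heights):
    print(f"{lower}-{upper} kg: frequency {freq}, bar height {height}")
```

Note that the widest class (50 to 70 kg) gets the shortest bar even though it does not have the smallest frequency, which is the point of dividing by the width.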

boys_histogram_table.JPG girls_histogram_table.JPG

histogram_girls.JPG histogram_boys.JPG
It seems that both girls and boys have similar modal weights,
but the boys skew to the right, resulting in a larger average than the girls, who skew to the left.
histogram_year_7_girls.JPG histogram_year_7_boys.JPG

histogram_year_11_girls.JPG histogram_year_11_boys.JPG
It would appear that girls and boys in year 7 have very similar distributions.
For year 11, girls remain in a narrow pattern whereas boys have a much larger range of weights.
The mode seems to be slightly higher for boys than for girls in both year groups.

I am now going to calculate the standard deviation of each of the four data sets.
This should show us how far, on average, the pupils' weights are from the mean.
The smaller the standard deviation the closer all pupils are to the mean.
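
As a rough check on a hand calculation, the standard deviation can be computed as the square root of the mean of the squared distances from the mean. Dividing by the number of pupils (rather than one fewer) is an assumption about which version of the formula is used here. The function name and weights are hypothetical.

```python
import math


def standard_deviation(values):
    """Square root of the mean squared distance from the mean."""
    mean = sum(values) / len(values)
    variance = sum((v - mean) ** 2 for v in values) / len(values)
    return math.sqrt(variance)


# Hypothetical weights in kg with a mean of 45, not the coursework sample.
weights = [42, 44, 44, 44, 45, 45, 47, 49]
print(standard_deviation(weights))
```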

external image 64e7589b2d8fa6814f6d38f06cd3b43b.png

standard_deviation_year_7_girls.JPG standard_deviation_year_7_boys.JPG standard_deviation_year_11_girls.JPG standard_deviation_year_11_boys.JPG

As you can see from the calculations, the year 7 boys and girls and the year 11 girls have small standard deviations,
which means most of the pupils fall very close to the mean.
Year 11 boys, however, have a greater spread about the mean.
This confirms the histograms, where year 11 boys had a much bigger range.



Conclusion

There is little to no correlation between weight and distance from school, with the boys showing slight correlation.
(This would be due to the various methods of getting to school and not due to a greater distance meaning more exercise.)

The boys' weights are higher than the girls' and have a greater interquartile range.

The boys follow a more normal distribution of weights and the girls are slightly skewed to the right.

The standard deviation for Yr7 and Yr11 girls is much smaller than for the Yr11 boys.